Abstract

We prove that every n-point metric space of negative type (and, in particular, every n-point
subset of $L_1$) embeds into a Euclidean space with distortion $O(\sqrt{\log n} \cdot \log\log n)$, a result
which is tight up to the iterated logarithm factor. As a consequence, we obtain the best known
polynomial-time approximation algorithm for the Sparsest Cut problem with general demands.
Namely, if the demand is supported on a subset of size k, we achieve an approximation ratio of
$O(\sqrt{\log k} \cdot \log\log k)$.

1 Introduction 

Bi-Lipschitz embeddings of finite metric spaces, a topic originally studied in geometric analysis and
Banach space theory, became an integral part of theoretical computer science following work of
Linial, London, and Rabinovich [27]. They presented an algorithmic version of a result of Bourgain [7]
which shows that every n-point metric space embeds into $L_2$ with distortion $O(\log n)$. This
geometric viewpoint offers a way to understand the approximation ratios achieved by linear
programming (LP) and semidefinite programming (SDP) relaxations for cut problems [27, 5]. It soon
became apparent that further progress in understanding SDP relaxations would involve improving
Bourgain's general bound of $O(\log n)$ for n-point metric spaces of negative type. For instance, the
approximation ratio achieved by a well-known SDP relaxation for the general Sparsest Cut problem
is known to coincide exactly with the best-possible distortion bound achievable for the embedding
of n-point metrics of negative type into $L_1$; this is a striking connection between pure mathematics
and algorithm design.

Further progress on these problems required new insights into the structure of metric spaces
of negative type, and the design of more sophisticated and flexible embedding methods for finite
metrics. Coincidentally, significant progress was made recently on both these fronts. Arora, Rao
and Vazirani [3] proved a new structural theorem about metric spaces of negative type and used
it to design an $O(\sqrt{\log n})$-approximation algorithm for the uniform case of the Sparsest Cut problem.
Krauthgamer, Lee, Mendel and Naor [21] introduced a new embedding method called measured
descent which unified and strengthened many existing embedding techniques, and they used it to
solve a number of open problems in the field.

These breakthroughs indeed resulted in improved embeddings for negative type metrics; Chawla,
Gupta, and Racke [10] used the structural theorem of [3] (specifically, its stronger form due to
Lee [22]), in conjunction with measured descent, to show that every n-point metric of negative type
embeds into $L_2$ with distortion $O(\log n)^{3/4}$. In the present work, we show how one can achieve
distortion $O(\sqrt{\log n} \cdot \log\log n)$. This almost matches the 35-year-old lower bound of $\sqrt{\log n}$ from
Enflo [13]. Our methods use the results of [3, 22, 10] essentially as a "black box," together with an
enhancement of the measured descent technique.

Recall that a metric space (X, d) is said to be of negative type if $(X, \sqrt{d})$ is isometric to a subset
of Euclidean space. In particular, it is well known that $L_1$ is of negative type. (We also remind the
reader that $L_2$ is isometrically equivalent to a subset of $L_1$.) The parameter $c_2(X)$, known as the
Euclidean distortion of X, is the least distortion with which X embeds into Hilbert space, i.e. it is
the minimum of $\mathrm{distortion}(f) = \|f\|_{\mathrm{Lip}} \cdot \|f^{-1}\|_{\mathrm{Lip}}$ over all bijections $f : X \hookrightarrow L_2$. The mathematical
investigation of the problem we study here goes back to the work of Enflo [13], who showed that
the Euclidean distortion of the Hamming cube $\Omega_d = \{0,1\}^d$ equals $\sqrt{d} = \sqrt{\log_2 |\Omega_d|}$. The following
natural question is folklore in geometric and functional analysis.

"Is the discrete d-dimensional hypercube the most non-Euclidean $2^d$-point subset of $L_1$?"

A positive answer to this question would imply that any n-point subset of $L_1$ embeds in $L_2$ with
distortion $O(\sqrt{\log n})$. In fact, motivated by F. John's theorem in convex geometry (see [32]),
Johnson and Lindenstrauss [19] asked in 1983 whether every n-point metric space embeds into
$L_2$ with distortion $O(\sqrt{\log n})$. Here, the analogy between finite dimensional normed spaces and
finite metric spaces is not complete: Bourgain [7] has shown that for any n-point metric space
X, $c_2(X) = O(\log n)$, and this result is existentially optimal [27, 5]. By now we understand
that finite metric spaces (namely expander graphs) can exhibit an isoperimetric profile which no
Euclidean space can achieve, and this is the reason for the discrepancy with John's theorem.
However, it is known (see [21]) that several natural restricted classes of metrics do adhere to
the $O(\sqrt{\log n})$ Euclidean distortion suggested by John's theorem. Arguably, for applications in
theoretical computer science, the most important restricted class of metrics are those of negative
type, yet improvements over Bourgain's theorem for such metrics have long resisted the attempts
of mathematicians and computer scientists.
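
The Hamming cube example mentioned above can be checked directly on a small instance. The following Python sketch (the function names and the choice d = 4 are illustrative assumptions, not taken from the paper) verifies that the square root of the Hamming distance on $\{0,1\}^d$ is a Euclidean distance, so the cube is of negative type, and that the identity map into $\mathbb{R}^d$ has distortion $\sqrt{d}$. It only exhibits the upper bound; that no embedding does better is the content of Enflo's theorem.

```python
from itertools import product
import math

def hamming(u, v):
    """Hamming (L1) distance between two 0/1 tuples."""
    return sum(a != b for a, b in zip(u, v))

def euclidean(u, v):
    """Euclidean distance between two real tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cube_distortion_of_identity(d):
    """Distortion of the identity map from ({0,1}^d, Hamming) into R^d."""
    cube = list(product((0, 1), repeat=d))
    ratios = [euclidean(u, v) / hamming(u, v)
              for u in cube for v in cube if u != v]
    # distortion = (largest expansion) / (smallest expansion)
    return max(ratios) / min(ratios)

d = 4
cube = list(product((0, 1), repeat=d))
# On 0/1 vectors the squared Euclidean distance equals the Hamming distance,
# i.e. sqrt(Hamming) is isometric to a subset of Euclidean space.
negative_type_ok = all(
    math.isclose(euclidean(u, v) ** 2, hamming(u, v))
    for u in cube for v in cube
)
identity_distortion = cube_distortion_of_identity(d)
```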

The present paper is devoted to proving that up to iterated logarithmic factors, the answer
to the above question is positive. This yields a general tool for the rounding of certain classes of
semidefinite programs. As a result, we obtain the best-known polynomial time algorithm for the
approximation of the Sparsest Cut problem with general demands, improving over the previous
bounds due to [10] and the preceding works [21] and |Hj (which yield an $O(\log n)$ approximation).
This problem is described in Section 1.1. We now state our main result. In the case of metrics
of negative type (and not just $L_1$ metrics), it answers positively (up to iterated logarithms) a
well known conjecture in theoretical computer science and metric geometry (stated explicitly by
Goemans in [16]).

Theorem 1.1. Let (X,d) be an n-point metric space of negative type. Then

\[ c_2(X) = O\left(\sqrt{\log n} \cdot \log\log n\right). \]

Related work. Until recently, there was little solid evidence behind the conjecture that any n-point
subset of $L_1$ embeds in Hilbert space with distortion $O(\sqrt{\log n})$. In the paper [23], Lee, Mendel
and Naor show that any n-point subset of $L_1$ embeds into Hilbert space with average distortion
$O(\sqrt{\log n})$. Arora, Rao, and Vazirani [3] have shown that $O(\sqrt{\log n})$ distortion is achievable using a
different notion of average distortion, which turns out to be more relevant for bounding the actual
distortion. As described above, combining their result with the measured descent technique of
Krauthgamer, Lee, Mendel and Naor [21], Chawla, Gupta, and Racke [10] have recently proved
that for any n-point metric space X of negative type, $c_2(X) = O(\log n)^{3/4}$. It was conjectured [29,
pg. 379] that n-point metrics of negative type embed into $L_1$ with distortion $O(1)$. Recently, Khot
and Vishnoi [20] have obtained a lower bound of $\Omega(\log\log n)^{\delta}$, for some constant $\delta > 0$.

Our results also suggest that the dimension reduction lower bound of Brinkman and Charikar [H]
(see also [24]) is tight for certain distortions. They show that embedding certain n-point subsets of
$L_1$ into $\ell_1^d$ with distortion D requires that $d \ge n^{\Omega(1/D^2)}$. Theorem 1.1, together with theorems of
Johnson and Lindenstrauss [19] and Figiel, Lindenstrauss, and Milman, yields an embedding
of every n-point subset of $L_1$ into $\ell_1^{O(\log n)}$ with distortion $O(\sqrt{\log n} \cdot \log\log n)$.

1.1 Algorithmic application: The Sparsest Cut problem with general demands 

In this section, we briefly describe an application of Theorem 1.1 to the Sparsest Cut problem with
general demands (and its relation to the multi-commodity flow problem). This is a fundamental
NP-hard combinatorial optimization problem; we refer the interested reader to the articles [26, 2, 27, 5],
the survey [37] and Chapter 21 of the book [38] for additional information on Sparsest Cut, and
its applications to the design of approximation algorithms.

Let G = (V, E) be a graph (network), with a capacity $C(e) \ge 0$ associated to every edge $e \in E$.
Assume that we are given k pairs of vertices $(s_1, t_1), \ldots, (s_k, t_k) \in V \times V$ and $D_1, \ldots, D_k \ge 1$. We
think of the $s_i$ as sources, the $t_i$ as targets, and the value $D_i$ as the demand of the terminal pair
$(s_i, t_i)$ for commodity i. The problem is said to have uniform demands if every pair $u, v \in V$
occurs as some $(s_i, t_i)$ pair with $D_i = 1$.

In the MaxFlow problem the objective is to maximize the fraction $\lambda$ of the demand that can be
shipped simultaneously for all the commodities, subject to the capacity constraints. Denote this
maximum by $\lambda^*$. A trivial upper bound on $\lambda^*$ is the cut ratio. Given any subset $S \subseteq V$, we write

\[ \Phi(S) = \frac{\sum_{uv \in E} C(uv) \cdot |\mathbf{1}_S(u) - \mathbf{1}_S(v)|}{\sum_{i=1}^{k} D_i \cdot |\mathbf{1}_S(s_i) - \mathbf{1}_S(t_i)|}, \]

where $\mathbf{1}_S$ is the characteristic function of S. The value $\Phi^* = \min_{S \subseteq V} \Phi(S)$ is the minimum over all
cuts (partitions) of V, of the ratio between the total capacity crossing the cut and the total demand
crossing the cut. In the case of a single commodity (i.e. k = 1) the classical MaxFlow-MinCut
theorem states that $\lambda^* = \Phi^*$, but in general this is no longer the case. It is known [26, 27, 5]
that $\Phi^* = O(\log k)\,\lambda^*$. This result is perhaps the first striking application of metric embeddings in
combinatorial optimization (specifically, it uses Bourgain's embedding theorem [7]).
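
To make the cut ratio concrete, here is a minimal Python sketch that evaluates $\Phi(S)$ and finds $\Phi^*$ by exhaustive search. The function names and the toy instance (a 4-cycle with two demand pairs) are assumptions made for illustration; the exhaustive search is exponential and has nothing to do with the approximation algorithm of this paper.

```python
from itertools import combinations

def cut_ratio(S, capacities, demands):
    """Phi(S): capacity crossing the cut divided by demand crossing it."""
    S = set(S)
    cap = sum(c for (u, v), c in capacities.items() if (u in S) != (v in S))
    dem = sum(D for (s, t), D in demands.items() if (s in S) != (t in S))
    return cap / dem if dem > 0 else float("inf")

def sparsest_cut_bruteforce(vertices, capacities, demands):
    """Exhaustive minimum of Phi over proper nonempty subsets (tiny graphs only)."""
    best_ratio, best_S = float("inf"), None
    for size in range(1, len(vertices)):
        for S in combinations(vertices, size):
            r = cut_ratio(S, capacities, demands)
            if r < best_ratio:
                best_ratio, best_S = r, set(S)
    return best_ratio, best_S

# Hypothetical toy instance: a 4-cycle with unit capacities and two demand pairs.
vertices = [0, 1, 2, 3]
capacities = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0}
demands = {(0, 2): 1.0, (1, 3): 1.0}
phi_star, best_cut = sparsest_cut_bruteforce(vertices, capacities, demands)
```

On this instance a single vertex has ratio 2 (two unit edges cut, one demand separated), while splitting the cycle into two adjacent pairs separates both demands at the same capacity.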

Computing $\Phi^*$ is NP-hard [30]. Moreover, finding a cut for which $\Phi^*$ is (approximately) attained
is a basic step in approximation algorithms for several NP-hard problems [26, 2, 37]. The best known
algorithm for computing $\Phi^*$ in the case of uniform demands is due to [3], where an approximation
ratio of $O(\sqrt{\log n})$ is achieved. In the case of general demands, an approximation ratio of $O(\log k)^{3/4}$
is obtained in [10]. Here, as an application of Theorem 1.1, we prove the following theorem:

Theorem 1.2. Using the above notation, there exists a polynomial-time algorithm which produces
a subset $S \subseteq V$ for which

\[ \Phi(S) = O\left(\sqrt{\log k} \cdot \log\log k\right) \Phi^*. \]

Structure of the paper: This paper is organized as follows. In Section 2 we present an informal
overview of the ideas involved in the proof of Theorem 1.1. Section 3 is devoted to various
preliminaries on the geometry of metrics of negative type. Theorem 1.1 is proved in Section 4, and the
algorithm of Theorem 1.2 is described and analyzed in Section 5. We end with Section 6, which
contains additional remarks and open problems.



2 Overview of the proof of Theorem 1.1

Remarks on notation. When we write $E \gtrsim F$ for two expressions E and F, we intend this to
mean that there exists some $c > 0$ such that $E \ge cF$, where c is intended to be a universal constant,
independent of the variables or parameters on which E and F depend.

We will often work with Hilbert spaces of the following form: If H is a Hilbert space, and $(\Omega, \mu)$
is a probability space, we use $L_2(H, \Omega, \mu)$ to denote the Hilbert space of H-valued random variables
Z with norm $\|Z\|_{L_2(H,\Omega,\mu)} = \sqrt{\mathbb{E}\,\|Z\|_H^2}$. When H, $\Omega$ are clear from context, we simply write $L_2(\mu)$
and denote $\|\cdot\|_H$ by $\|\cdot\|_2$.

Our proof of Theorem 1.1 has little to do with metrics of negative type; the connection to such
spaces comes through the techniques of [3, 22, 10] and is laid out in Section 3. Instead, we present
a general theorem about gluing together various maps from finite metric spaces into Hilbert spaces
(and, more generally, $L_p$ spaces for $p \in [1, \infty)$). Our starting point is the following type of ensemble.

Let (X, d) be an n-point metric space. Suppose that for every $\tau > 0$, and every subset $S \subseteq X$,
there exists a 1-Lipschitz map $\varphi_{S,\tau} : X \to L_2$ with

\[ \|\varphi_{S,\tau}(x) - \varphi_{S,\tau}(y)\|_2 \ge \frac{\tau}{\sqrt{\log |S|}} \tag{1} \]

whenever $x, y \in S$ and $d(x,y) \in [\tau, 2\tau]$. In general, $\sqrt{\log |S|}$ could be a different function of $|S|$,
but we restrict ourselves here for simplicity. Additionally, let us temporarily define $\varphi_\tau = \varphi_{X,\tau}$ for
every $\tau > 0$ so that for the maps $\{\varphi_\tau\}$, condition (1) holds for all $x, y \in X$ and $|S| = n$. The
problem we are now confronted with is how to combine the ensemble of maps $\{\varphi_{S,\tau}\}$ together to
obtain a genuinely bi-Lipschitz map.

There is an obvious approach which comes to mind: Let $R \subseteq \mathbb{Z}$ be such that for all $x, y \in X$,
there exists $k \in R$ such that $d(x, y) \in [2^k, 2^{k+1}]$. Now define the map $\varphi : X \to L_2$ by $\varphi = \bigoplus_{k \in R} \varphi_{2^k}$.
Clearly we have both $\|\varphi\|_{\mathrm{Lip}} \le \sqrt{|R|}$ and, for all $x, y \in X$,

\[ \|\varphi(x) - \varphi(y)\|_2 \ge \frac{d(x,y)}{2\sqrt{\log n}}, \]

hence $\mathrm{distortion}(\varphi) = O(\sqrt{|R| \log n})$. Trivially, we can choose R so that $|R| \le n^2$. A slightly more
delicate argument yields such an R with $|R| \le O(n)$. Unfortunately, we are searching for a bound
of the form $\mathrm{distortion}(\varphi) \approx \sqrt{\log n}$, making this construction useless.
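
The cost of the naive construction is easy to see on an example. The Python sketch below (function names, the half-open convention $[2^k, 2^{k+1})$, the natural logarithm and the test points are all assumptions made for illustration) collects the set R of dyadic scales that actually occur in a finite metric space and evaluates the resulting bound $\sqrt{|R|} \cdot 2\sqrt{\log n}$.

```python
import math
from itertools import combinations

def dyadic_scales(points, dist):
    """Set R of integers k with d(x, y) in [2^k, 2^(k+1)) for some pair x != y."""
    return {math.floor(math.log2(dist(x, y)))
            for x, y in combinations(points, 2)}

def naive_distortion_bound(points, dist):
    """Value sqrt(|R|) * 2 * sqrt(log n) given by the direct-sum construction."""
    n = len(points)
    R = dyadic_scales(points, dist)
    return math.sqrt(len(R)) * 2 * math.sqrt(math.log(n))

# Hypothetical example: the points 1, 2, 4, ..., 2^9 on the real line.
points = [2.0 ** i for i in range(10)]
dist = lambda x, y: abs(x - y)
R = dyadic_scales(points, dist)
bound = naive_distortion_bound(points, dist)
```

For these geometrically spaced points nearly every scale is populated, so $|R|$ grows linearly with n and the bound is far from $\sqrt{\log n}$.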

Nevertheless, the key to a better gluing of the given ensemble does lie in the delicate interplay
between the distributions of distances in X and the number of points in various regions of the
space. The technique of measured descent from [21] relies essentially on two facts about finite
metric spaces. First, the identity

\[ \sum_{k \in \mathbb{Z}} \log \frac{|B(x, \alpha \cdot 2^k)|}{|B(x, 2^k)|} = O(\log n \log \alpha) \tag{2} \]

for any number $\alpha \ge 2$. (In [21] a fixed constant value of $\alpha$ was used, but for us the quantitative
dependence is crucial, as we will have $\alpha$ depending on n.) This gives a simple bound on the rate
that a finite metric space can expand over all its scales, and is implicitly used in earlier works under
the name of "region growing" [26, 14].
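
The bound (2) is a telescoping sum: when $\alpha = 2^j$, each value $\log |B(x, 2^i)|$ appears once with a plus sign and once with a minus sign except at the two ends, so the total is at most $j \log n$. The Python sketch below checks this numerically on random points in the unit square; the point set, the scale range and the function names are assumptions chosen for the example.

```python
import math
import random

def ball_size(points, x, radius, dist):
    """|B(x, radius)| for the closed ball in a finite metric space."""
    return sum(1 for z in points if dist(x, z) <= radius)

def growth_sum(points, x, alpha, dist, k_min, k_max):
    """Sum over k in [k_min, k_max] of log(|B(x, alpha 2^k)| / |B(x, 2^k)|)."""
    total = 0.0
    for k in range(k_min, k_max + 1):
        big = ball_size(points, x, alpha * 2.0 ** k, dist)
        small = ball_size(points, x, 2.0 ** k, dist)
        total += math.log(big / small)
    return total

random.seed(0)
points = [(random.random(), random.random()) for _ in range(64)]
dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
n, alpha = len(points), 8.0
# Scales above 2^5 contribute nothing: both balls are already all of X.
total = growth_sum(points, points[0], alpha, dist, -40, 5)
telescoping_bound = math.ceil(math.log2(alpha)) * math.log(n)
```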

For the purposes of this description, we will state the second fact less concretely. Basically,
in certain settings, one can think of the ratio $|B(x, \alpha\tau)| / |B(x, \tau)|$ as the "local cardinality of the
space" around x at scale $\tau$. As an example, if $X = \mathbb{R}^d$, $B(x, \cdot)$ represents a Euclidean ball, and
$|\cdot|$ is the Lebesgue measure, then this ratio approximates the number of $\tau$-net points that can be
packed inside a ball of radius $\alpha\tau$. Later, it will become necessary to randomly partition X into
pieces of diameter at most $2\tau$ while ensuring that pairs $x, y \in X$ with $d(x, y) \ll \tau$ are usually in the
same component of the partition (see Section 3.1 on padded decomposability). It is known 0E]
that the properties of such partitions near x depend on the local value $\log \frac{|B(x, \alpha\tau)|}{|B(x, \tau)|}$.

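This relationship is used in [22] to prove (roughly) that, given the maps $\{\varphi_\tau\}_{\tau > 0}$
defined above, there exists a map $\varphi : X \to L_2$ such that $\|\varphi\|_{\mathrm{Lip}} \le O(\sqrt{\log n})$ and, for $x, y \in X$
with $d(x,y) \in [2^k, 2^{k+1}]$,

\[ \|\varphi(x) - \varphi(y)\|_2 \gtrsim \sqrt{\log \frac{|B(x, 2^{k+1})|}{|B(x, 2^k)|}} \cdot \Bigg( \underbrace{\|\varphi_{2^k}(x) - \varphi_{2^k}(y)\|_2}_{\text{(I)}} + \underbrace{\frac{d(x,y)}{\log \frac{|B(x, 2^{k+1})|}{|B(x, 2^k)|}}}_{\text{(II)}} \Bigg). \tag{3} \]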

The contribution (II) comes from random partitioning and Rao's technique [36], and is valid for
any metric space X. Observing that (I) $\gtrsim d(x, y)/\sqrt{\log n}$, and using AM-GM in (3), one arrives at
the lower bound

\[ \|\varphi(x) - \varphi(y)\|_2 \gtrsim \frac{d(x,y)}{(\log n)^{1/4}}, \]

hence $\mathrm{distortion}(\varphi) \le O(\log n)^{3/4}$. While not obvious at present, the identity (2) is what allows
one to get the leading square-root factor in (3) while keeping $\|\varphi\|_{\mathrm{Lip}}$ small (see Theorem 4.5).
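
For readers who want the AM-GM step spelled out, the following short derivation may help. It assumes that (3) has the form $\sqrt{\ell}\,\big(\text{(I)} + \text{(II)}\big)$ with $\text{(II)} = d(x,y)/\ell$, where $\ell$ is shorthand (introduced here, not in the source) for the logarithm of the ball ratio.

```latex
% Assumption: \ell denotes the log of the ball ratio appearing in (3),
% (I) \gtrsim d(x,y)/\sqrt{\log n}, and (II) = d(x,y)/\ell.
\begin{align*}
\sqrt{\ell}\left(\frac{d(x,y)}{\sqrt{\log n}} + \frac{d(x,y)}{\ell}\right)
  &= d(x,y)\left(\sqrt{\frac{\ell}{\log n}} + \frac{1}{\sqrt{\ell}}\right) \\
  &\ge 2\,d(x,y)\sqrt{\sqrt{\frac{\ell}{\log n}}\cdot\frac{1}{\sqrt{\ell}}}
   \qquad\text{(AM-GM: } a + b \ge 2\sqrt{ab}\text{)} \\
  &= \frac{2\,d(x,y)}{(\log n)^{1/4}} .
\end{align*}
% Combined with \|\varphi\|_{Lip} \le O(\sqrt{\log n}) this gives
% distortion O(\sqrt{\log n}) \cdot (\log n)^{1/4} = O(\log n)^{3/4}.
```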

In order to get the distortion near $O(\sqrt{\log n})$, we have to dispense with the contribution (II)
which is not derived from the ensemble $\{\varphi_{S,\tau}\}$. Instead, we would like to pass from the ensemble
$\{\varphi_{S,\tau}\}$ to a family of maps $\{\tilde\varphi_\tau : X \to L_2\}$ for which the contribution of (I) in (3) is replaced by

\[ \|\tilde\varphi_{2^k}(x) - \tilde\varphi_{2^k}(y)\|_2 \gtrsim \frac{d(x,y)}{\sqrt{\log \frac{|B(x, \alpha 2^k)|}{|B(x, 2^k)|}}}. \tag{4} \]

Clearly this would finish the proof. Roughly, the construction of $\tilde\varphi_{2^k}$ proceeds as follows. We
first randomly partition X into components of diameter about $2^k \alpha$ for some appropriately chosen
$\alpha = \alpha(n)$. Writing the random partition as $X = C_1 \cup C_2 \cup \cdots \cup C_m$, we then derive subsets $\tilde C_i \subseteq C_i$
by randomly sampling points from each $C_i$. Then, we use an appropriately constructed (random)
partition of unity to glue the collection of maps $\{\varphi_{\tilde C_i, 2^k}\}_{i=1}^{m}$ together. To ensure that the resulting
map still has $\|\tilde\varphi_{2^k}\|_{\mathrm{Lip}} \le O(1)$, the partition of unity is constructed carefully using properties of
the random partition (this bears some resemblance to the technique of [25] for extending Lipschitz
functions).

The key to the proof is the way in which the random samples $\tilde C_i$ are chosen. We have to
maintain the property that $\tilde C_i$ is a "good representative" of $C_i$ at scale $2^k$ (i.e. we need that, on
average, $\tilde C_i$ is $2^k$-dense in $C_i$). On the other hand, we need to maintain the invariant that if
$x \in C_i$, then

\[ \log |\tilde C_i| \approx \log \frac{|B(x, \alpha 2^k)|}{|B(x, 2^k)|}, \]

so that we can achieve a bound similar to (4) (recall that the quality of the map $\varphi_{S,\tau}$ depends on
$|S| = |\tilde C_i|$). Unfortunately, this is impossible since for distinct $x, x' \in C_i$, the above ratios can be
quite different. Instead, we have a number of phases, one for each estimate of the possible ratio
(see the proof of Theorem 4.1). For this to work, we have to give up on achieving (4) exactly, and
instead we weave together the inter-scale gluing of (3) (Theorem 4.5) with the intra-scale gluing
of (4) (Lemma 4.4) to obtain a nearly-tight bound of $O(\sqrt{\log n} \cdot \log\log n)$.

3 Single scale embeddings 

In this section we present Theorem 3.1 and derive from it Lemma 3.5, which is one of the main
tools used in the proof of the Main Theorem. It is a concatenation of the result of Arora, Rao,
and Vazirani [3], its strengthening by Lee [22], and the "reweighting" method of Chawla, Gupta,
and Racke [10], who use it in conjunction with [21] to achieve distortion $O(\log n)^{3/4}$. For the sake
of completeness, we present below a sketch of the proof of Theorem 3.1. Complete details can be
found in the full version of [22], where a more general result is proved; the statement actually holds
for metric spaces which are quasisymmetrically equivalent to subsets of Hilbert space, and not only
for those of negative type. (See JH] for the definition of quasisymmetry; the relevance of such maps
to the techniques of [3] was first pointed out in P4_.)



Theorem 3.1. There exist constants $C \ge 1$ and $0 < p < \frac{1}{2}$ such that for every n-point metric
space (Y, d) of negative type and every $\Delta > 0$, the following holds. There exists a distribution $\mu$
over subsets $U \subseteq Y$ such that for every $x, y \in Y$ with $d(x, y) \ge \frac{\Delta}{16}$,

\[ \mu\left\{ U : y \in U \text{ and } d(x, U) \ge \frac{\Delta}{C\sqrt{\log n}} \right\} \ge p. \]

Proof (sketch). Let $g : Y \to \ell_2$ be such that

\[ d(x,y) = \|g(x) - g(y)\|_2^2 \]

for all $x, y \in Y$. By [31], there exists a map $T : \ell_2 \to \ell_2$ such that $\|T(z)\|_2 \le \sqrt{\Delta}$ for all $z \in \ell_2$ and

\[ \frac{1}{2} \le \frac{\|T(z) - T(z')\|_2}{\min\{\sqrt{\Delta}, \|z - z'\|_2\}} \le 1 \]

for all $z, z' \in \ell_2$. As in [22], we let $f : Y \to \mathbb{R}^n$ be the map given by $f = T \circ g$ (we remark that this
map can be computed efficiently). Then f is a bi-Lipschitz embedding (with distortion 2) of the
metric space $\left(Y, \sqrt{\min\{\Delta, d\}}\right)$ into the Euclidean ball of radius $\sqrt{\Delta}$.

Let $0 < \sigma < 1$ be some constant. The basic idea is to choose a random $u \in S^{n-1}$ and define

\[ L_u = \left\{ x \in Y : \langle x, u \rangle \le -\frac{\sigma\sqrt{\Delta}}{\sqrt{n}} \right\}, \qquad R_u = \left\{ x \in Y : \langle x, u \rangle \ge \frac{\sigma\sqrt{\Delta}}{\sqrt{n}} \right\}. \]

One then prunes the sets by iteratively removing any pairs of nodes $x \in L_u$, $y \in R_u$ with $d(x, y) \le
\Delta/\sqrt{\log n}$. At the end one is left with two sets $L'_u$, $R'_u$. The main result of [3, 22] is that with high
probability (over the choice of u), the number of pairs pruned from $L_u \times R_u$ is not too large.

Let $S_\Delta = \{(x,y) \in Y \times Y : d(x,y) \ge \frac{\Delta}{16}\}$. The reweighting idea of [10] is to apply the above
procedure to a weighted version of the point set as follows. Let $w : Y \times Y \to \mathbb{Z}_+$ be an integer-valued
weight function on pairs, with $w(x,y) = w(y,x)$, $w(x,x) = 0$, and $w(x,y) > 0$ only if $(x,y) \in S_\Delta$.
This weight function can be viewed as yielding a new set of points where each point x is replaced
by $\sum_{y \in Y} w(x, y)$ copies, with $w(x, y)$ of them corresponding to the pair (x, y). One could think of
applying the above procedure on this new point set; note that the pruning procedure above may
remove some or all copies of x. Then, as observed in [10], the theorems of [3, 22] imply that with
high probability, after the pruning, we still have

\[ \sum_{x \in L'_u,\, y \in R'_u} w(x,y) \gtrsim \sum_{x, y} w(x,y). \]

The distribution $\mu$ mentioned in the statement of the theorem is defined using a family of
$O(\log n)$ weight functions described below. Sampling from $\mu$ consists of picking a weight function
from this family and a random direction $u \in S^{n-1}$, and then forming sets $L'_u$, $R'_u$ as above using
the weight function. Let us call these sets $L'_u(w)$, $R'_u(w)$. One then outputs the set U of all points
x for which any "copy" falls into $L'_u(w)$.

Now we define the family of weight functions. The initial weight function has $w_0(x, y) = n^4$ for
all $(x,y) \in S_\Delta$. Given $w_k$, obtain $w_{k+1}$ as follows. If

\[ \Pr_{u \in S^{n-1}}\left[ (x,y) \in L'_u(w_k) \times R'_u(w_k) \right] \ge 0.1, \]

we set $w_{k+1}(x,y) = \frac{1}{2} w_k(x,y)$. Otherwise, we set $w_{k+1}(x,y) = w_k(x,y)$. A simple argument
(presented in [10]) shows that by repeating this $O(\log n)$ times we obtain $O(\log n)$ weight functions
such that for every pair $(x, y) \in S_\Delta$ the following is true: If one picks a random weight function
w and a random direction $u \in S^{n-1}$, then with constant probability we have $(x,y) \in L'_u(w) \times
R'_u(w)$. □



3.1 Padded decomposability and random zero sets 

Theorem 3.1 is the only way the negative type property will be used in what follows. It is therefore
helpful to introduce it as an abstract property of metric spaces. Let (X, d) be an n-point metric
space.

Definition 3.2 (Random zero-sets). Given $\Delta, \zeta > 0$, and $p \in (0,1)$ we say that X admits a
random zero set at scale $\Delta$ which is $\zeta$-spreading with probability p if there is a distribution $\mu$ over
subsets $Z \subseteq X$ such that for every $x, y \in X$ with $d(x, y) \ge \Delta$,

\[ \mu\left\{ Z \subseteq X : y \in Z \text{ and } d(x, Z) \ge \frac{\Delta}{\zeta} \right\} \ge p. \]

We denote by $\zeta(X; p)$ the least $\zeta > 0$ such that for every $\Delta > 0$, X admits a random zero set at
scale $\Delta$ which is $\zeta$-spreading with probability p. Finally, given $k \le n$ we define

\[ \zeta_k(X; p) = \max_{Y \subseteq X,\ |Y| \le k} \zeta(Y; p). \]
With this definition, Theorem 3.1 implies that there exists a universal constant $p \in (0, 1)$ such
that for every n-point metric space (X,d) of negative type, $\zeta(X; p) = O(\sqrt{\log n})$.

We now recall the related notion of padded decomposability. Given a partition P of X and
$x \in X$ we denote by $P(x) \in P$ the unique element of P to which x belongs. In what follows we
sometimes refer to P(x) as the cluster of x.

Definition 3.3 (Decomposition bundle, modulus of padded decomposability). Following [21],
we say that $\{P_\Delta\}_{\Delta > 0}$ is an $\alpha$-padded decomposition bundle of a metric space X if for every
$\Delta > 0$, $P_\Delta$ is a random partition of X (whose distribution we denote by $\nu$) with the following
properties:

1. For all $P \in \mathrm{supp}(\nu)$ and all $C \in P$ we have that $\mathrm{diam}(C) < \Delta$.

2. For every $x \in X$ we have that

\[ \nu\{ P : B(x, \Delta/\alpha) \subseteq P(x) \} \ge \frac{1}{2}. \]

The modulus of padded decomposability of X, denoted $\alpha_X$, is defined as the least constant $\alpha > 0$
such that X admits an $\alpha$-padded decomposition bundle.

As observed in [21], the results of [2315] imply that $\alpha_X = O(\log |X|)$, and this will be used
in the ensuing arguments. The following useful fact relates the notions of padded decomposability
and random zero sets. Its proof is motivated by an argument of Rao [36].

Fact 3.4. $\zeta(X; 1/8) \le \alpha_X$.

Proof. Fix $\Delta > 0$ and let P be a partition of X into subsets of diameter less than $\Delta$. Given $x \in X$
we denote by $\pi_P(x)$ the largest radius r for which $B(x, r) \subseteq P(x)$. Let $\{\varepsilon_C\}_{C \in P}$ be i.i.d. symmetric
$\{0, 1\}$-valued Bernoulli random variables. Let $Z_P$ be a random subset of X given by

\[ Z_P = \bigcup_{C \in P :\ \varepsilon_C = 0} C. \]

If $x, y \in X$ satisfy $d(x, y) \ge \Delta$ then $P(x) \ne P(y)$. It follows that

\[ \Pr\left[ y \in Z_P \wedge d(x, Z_P) \ge \pi_P(x) \right] \ge \frac{1}{4}. \]

By the definition of $\alpha_X$, there exists a distribution over partitions P of X into subsets of diameter
less than $\Delta$ such that for every $x \in X$, with probability at least 1/2, $\pi_P(x) \ge \Delta/\alpha_X$. The required
result now follows by considering the random zero set $Z_P$. □
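
The probability 1/4 in the proof can be confirmed by enumerating the Bernoulli bits for a fixed partition. The Python sketch below does this for a hypothetical partition of twelve points on a line; the helper names are invented for the example, and the padding radius is taken to be the distance from x to the complement of its cluster, which is the supremum of the radii r with $B(x, r) \subseteq P(x)$.

```python
from itertools import product

def padding_radius(x, cluster, points, dist):
    """Distance from x to the complement of its cluster (inf if the cluster is X)."""
    outside = [dist(x, z) for z in points if z not in cluster]
    return min(outside) if outside else float("inf")

def zero_set(partition, bits):
    """Union of the clusters whose Bernoulli bit equals 0."""
    return {z for cluster, bit in zip(partition, bits) if bit == 0 for z in cluster}

def dist_to_set(x, Z, dist):
    """d(x, Z), with the convention d(x, empty set) = inf."""
    return min((dist(x, z) for z in Z), default=float("inf"))

# Hypothetical example: integers 0..11 on a line, clusters of diameter 3 < Delta = 4.
points = list(range(12))
dist = lambda a, b: abs(a - b)
partition = [set(range(0, 4)), set(range(4, 8)), set(range(8, 12))]
cluster_of = {z: c for c in partition for z in c}

x, y = 1, 6   # d(x, y) = 5 >= Delta, so x and y lie in different clusters
pad = padding_radius(x, cluster_of[x], points, dist)
all_bits = list(product((0, 1), repeat=len(partition)))
good = 0
for bits in all_bits:
    Z = zero_set(partition, bits)
    # Success: y falls in the zero set and x stays at distance >= pad from it.
    if y in Z and dist_to_set(x, Z, dist) >= pad:
        good += 1
success_probability = good / len(all_bits)
```

Success happens exactly when the cluster of y gets bit 0 and the cluster of x gets bit 1, which is two of the eight assignments here.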

We end this section with the following simple lemma, which shows that the existence of random 
zero sets implies the existence of embeddings into L2 which are bi-Lipschitz on a fixed distance 
scale. 

Lemma 3.5 (Random zero sets yield single scale embeddings). For every finite metric space
X, every $S \subseteq X$, every $p \in (0, 1)$ and every $\tau > 0$, there exists a 1-Lipschitz mapping $\varphi : X \to L_2$
such that for every $x, y \in S$ with $d(x, y) \ge \tau$,

\[ \|\varphi(x) - \varphi(y)\|_2 \ge \frac{\sqrt{p} \cdot \tau}{\zeta(S; p)}. \]
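Proof. By the definition of $\zeta(S; p)$ there exists a distribution $\mu$ over subsets $Z \subseteq S$ such that for
every $x, y \in S$ with $d(x, y) \ge \tau$,

\[ \mu\left\{ Z \subseteq S : y \in Z \text{ and } d(x, Z) \ge \frac{\tau}{\zeta(S; p)} \right\} \ge p. \]

Define $\varphi : X \to L_2(\mu)$ by $\varphi(x) = d(x, Z)$. Clearly $\varphi$ is 1-Lipschitz. Moreover, for every $x, y \in S$
with $d(x, y) \ge \tau$,

\[ \|\varphi(x) - \varphi(y)\|_{L_2(\mu)}^2 = \mathbb{E}_\mu\, |d(x, Z) - d(y, Z)|^2 \ge p \cdot \left( \frac{\tau}{\zeta(S; p)} \right)^2. \]

□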
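
The map $x \mapsto d(x, Z)$ from this proof is simple to realize empirically. The Python sketch below replaces the distribution $\mu$ by the empirical average over m sampled sets, so that $L_2(\mu)$ becomes $\mathbb{R}^m$ with coordinates scaled by $1/\sqrt{m}$. The sampled sets are uniformly random subsets, a hypothetical stand-in rather than the zero sets of Theorem 3.1, so only the 1-Lipschitz property is illustrated.

```python
import math
import random

def zero_set_embedding(points, zero_sets, dist):
    """Map x to (d(x, Z_1), ..., d(x, Z_m)) / sqrt(m): an empirical version of
    x -> d(x, Z) viewed as an element of L_2(mu)."""
    m = len(zero_sets)
    return {x: [min(dist(x, z) for z in Z) / math.sqrt(m) for Z in zero_sets]
            for x in points}

def l2(u, v):
    """Euclidean distance between two coordinate lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

random.seed(1)
points = [(random.random(), random.random()) for _ in range(30)]
dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
# Hypothetical stand-in for samples from mu: uniformly random nonempty subsets.
zero_sets = [random.sample(points, random.randint(1, len(points))) for _ in range(50)]
phi = zero_set_embedding(points, zero_sets, dist)
# Each coordinate x -> d(x, Z_i) is 1-Lipschitz, hence so is the scaled direct sum.
lipschitz_ok = all(l2(phi[x], phi[y]) <= dist(x, y) + 1e-12
                   for x in points for y in points)
```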



4 Proof of Theorem 1.1

The primary result of this section is the following theorem. 

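Theorem 4.1. Let (X,d) be an n-point metric space. Suppose there exist constants $C > 0$ and
$\frac{1}{2} \le \varepsilon \le 1$, such that for every $\tau > 0$, and every subset $S \subseteq X$, there exists a 1-Lipschitz map
$\varphi_{S,\tau} : X \to L_2$ with

\[ \|\varphi_{S,\tau}(x) - \varphi_{S,\tau}(y)\|_2 \ge \frac{\tau}{C (\log |S|)^{\varepsilon}} \]

whenever $x, y \in S$ and $d(x, y) \in [\tau, 6\tau]$. Then $c_2(X) \le O(1) \cdot C (\log n)^{\varepsilon} \log\log n$.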

Theorem 4.1 implies Theorem 1.1. Indeed, if X is an n-point metric space such that for some
$p \in (0, 1)$, $\varepsilon \in [1/2, 1]$, and $C > 0$, we have for every $k \le n$, $\zeta_k(X; p) \le C (\log k)^{\varepsilon}$, then Theorem 4.1
together with Lemma 3.5 imply that

\[ c_2(X) = O\left( \frac{C (\log n)^{\varepsilon} \log\log n}{\sqrt{p}} \right). \]

Theorem 1.1 follows since by Theorem 3.1 we know that for some universal constant $p \in (0, 1)$, if
X is a metric space of negative type then for all k, $\zeta_k(X; p) = O(\sqrt{\log k})$.

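The proof of Theorem 4.1 will be broken down into several steps. In what follows we fix a finite
metric space X, and for $K \ge 1$, $\tau > 0$, define

\[ S_\tau(K) = \left\{ x \in X : |B(x, 8\tau\alpha_X)| \le K \cdot \left| B\left( x, \frac{\tau}{12 C (\log K)^{\varepsilon}} \right) \right| \right\}. \]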

Lemma 4.2 (Embedding neighborhoods). Let $S \subseteq X$, $\tau > 0$, and assume that there exists a
1-Lipschitz map $\varphi : X \to L_2$ satisfying

\[ \|\varphi(x) - \varphi(y)\|_2 \ge \frac{\tau}{L} \]

for $x, y \in S$, $d(x,y) \in [\tau/2, 3\tau]$ and some $L \ge 2$. Then there is a 1-Lipschitz map $h : X \to L_2$ with

\[ \|h(x) - h(y)\|_2 \ge \frac{\tau}{9L} \]

whenever $d(x, S) \le \frac{\tau}{6L}$, $y \in X$, and $d(x,y) \in [\tau, 2\tau]$.
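Proof. Define $g : X \to \mathbb{R}$ by $g(x) = d(x, S)$, and set $h = \frac{1}{\sqrt{2}} (\varphi \oplus g)$. If $d(y, S) > \frac{\tau}{3L}$ then

\[ \|h(x) - h(y)\|_2 \ge \frac{1}{\sqrt{2}} |g(x) - g(y)| \ge \frac{1}{\sqrt{2}} \big( d(y,S) - d(x,S) \big) > \frac{1}{\sqrt{2}} \left( \frac{\tau}{3L} - \frac{\tau}{6L} \right) = \frac{\tau}{6\sqrt{2}\, L} \ge \frac{\tau}{9L}. \]

Otherwise, let $x', y' \in S$ be such that $d(x, x') \le \frac{\tau}{6L}$, $d(y, y') \le \frac{\tau}{3L}$, and observe that

\[ d(x', y') \in \left[ d(x,y) - \frac{\tau}{6L} - \frac{\tau}{3L},\ d(x,y) + \frac{\tau}{6L} + \frac{\tau}{3L} \right] \subseteq \left[ \frac{\tau}{2}, 3\tau \right]. \]

Using our assumptions on $\varphi$, we have

\[ \|\varphi(x) - \varphi(y)\|_2 \ge \|\varphi(x') - \varphi(y')\|_2 - \|\varphi\|_{\mathrm{Lip}} \big( d(x,x') + d(y,y') \big) \ge \frac{\tau}{L} - \left( \frac{\tau}{6L} + \frac{\tau}{3L} \right) = \frac{\tau}{2L}, \]

hence $\|h(x) - h(y)\|_2 \ge \frac{\tau}{2\sqrt{2}\, L} \ge \frac{\tau}{9L}$.

□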



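Lemma 4.3 (Random subsets). Assume that X satisfies the conditions of Theorem 4.1 and
suppose that $U \subseteq X$ and $k \ge 2$. Define

\[ T_\tau(U; k) = \left\{ x \in U : |U| \le k \cdot \left| B\left( x, \frac{\tau}{12 C (\log k)^{\varepsilon}} \right) \right| \right\}. \]

Then there exists a 1-Lipschitz map $\gamma_{U,k} : X \to L_2$ such that

\[ \|\gamma_{U,k}(x) - \gamma_{U,k}(y)\|_2 \ge \frac{\tau}{30 C (\log k)^{\varepsilon}} \tag{5} \]

whenever $x \in T_\tau(U; k)$, $y \in X$ and $d(x, y) \in [\tau, 2\tau]$.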

Proof. Let S be a uniformly random subset $S \subseteq U$ with $|S| = \min\{|U|, k\}$. Let $h_S : X \to L_2$ be
the map defined by $h_S = \frac{1}{\sqrt{2}} (\varphi_{S,\tau/2} \oplus g)$ where $g(x) = d(x, S)$. Define $\gamma_{U,k} : X \to L_2(L_2, \mu)$, where
$\mu$ is the distribution of the random subset S, by $\gamma_{U,k}(x) = h_S(x)$ (recall that $h_S(x)$ is a random
element of $L_2$). Note that $\gamma_{U,k}$ is 1-Lipschitz because the same is true for each $h_S$.

Let $L = 2C (\log |S|)^{\varepsilon}$. Observe that, by the definition of $T_\tau(U; k)$, with probability at least 1/e,
we have

\[ S \cap B\left( x, \frac{\tau}{6L} \right) \supseteq S \cap B\left( x, \frac{\tau}{12 C (\log k)^{\varepsilon}} \right) \ne \emptyset. \]

Assuming this holds, we see that $d(x, S) \le \frac{\tau}{6L}$. Thus by Lemma 4.2, $\|h_S(x) - h_S(y)\|_2 \ge \frac{\tau}{9L}$. It
follows that

\[ \|\gamma_{U,k}(x) - \gamma_{U,k}(y)\|_2 \ge \frac{1}{\sqrt{e}} \cdot \frac{\tau}{9L} \ge \frac{\tau}{30 C (\log k)^{\varepsilon}}. \]

□
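
The sampling step can be sanity-checked with an exact hypergeometric computation. The Python sketch below (function names and parameter ranges are assumptions for illustration) computes the probability that a uniform subset S of size $\min\{|U|, k\}$ misses a fixed set of at least $|U|/k$ points of U, and confirms that it never exceeds $(1 - 1/k)^k \le 1/e$; in particular S hits the set with probability at least $1 - 1/e \ge 1/e$. It assumes the small ball in the definition of $T_\tau(U; k)$ is counted inside U.

```python
import math

def miss_probability(u, b, s):
    """Probability that a uniform s-subset of a u-element set avoids a fixed b-subset."""
    return math.comb(u - b, s) / math.comb(u, s)

def worst_miss(u_max, k):
    """Largest miss probability over |U| <= u_max when the ball holds >= |U|/k points."""
    worst = 0.0
    for u in range(1, u_max + 1):
        b = math.ceil(u / k)   # smallest ball size compatible with |U| <= k * |ball|
        s = min(u, k)          # |S| = min{|U|, k}
        worst = max(worst, miss_probability(u, b, s))
    return worst

worst = worst_miss(u_max=200, k=10)
```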



In what follows we shall use the fact that for every $\tau > 0$ there exists a mapping $G_\tau : L_2 \to L_2$
such that for every $x, y \in L_2$,

\[ \|G_\tau(x)\|_2 = \|G_\tau(y)\|_2 = \tau \quad \text{and} \quad \tfrac{1}{2} \min\{\tau, \|x - y\|_2\} \le \|G_\tau(x) - G_\tau(y)\|_2 \le \min\{\tau, \|x - y\|_2\}. \tag{6} \]

The existence of $G_\tau$ is precisely Lemma 5.2 in [31]. As in [22], we will use the map $G_\tau$ to control
the Lipschitz constant of various functions under partitions of unity.






Lemma 4.4 (Localization). Assume that X satisfies the conditions of Theorem 4.1. Then for
every $\tau > 0$, $k \ge 1$, there exists a 1-Lipschitz map $\Lambda_{\tau,k} : X \to L_2$ such that for every $x \in S_\tau(k)$, $y \in
X$ with $d(x,y) \in [\tau, 3\tau]$,

\[ \|\Lambda_{\tau,k}(x) - \Lambda_{\tau,k}(y)\|_2 \ge \frac{\tau}{240\, C (\log k)^{\varepsilon}}. \]

Proof. Let $D = 4\tau\alpha_X$ and take $P_D$ to be a random partition from the $\alpha_X$-padded bundle ensured
by Definition 3.3. Define a random mapping $\rho : X \to \mathbb{R}$ by

\[ \rho(z) = \min\left\{ 1, \frac{d(z, X \setminus P_D(z))}{\tau} \right\}. \]

Clearly $\|\rho\|_{\mathrm{Lip}} \le 1/\tau$. For each $U \in P_D$, let $\gamma_{U,k}$ be the corresponding map from Lemma 4.3.
Finally, define a random map $\Lambda_{\tau,k} : X \to L_2$ by

\[ \Lambda_{\tau,k}(z) = \tfrac{1}{2}\, \rho(z) \cdot \tilde\gamma_{P_D(z),k}(z), \]

where for $f : X \to L_2$ we write $\tilde f = G_\tau \circ f$, where $G_\tau$ is as in (6).

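We claim that $\|\Lambda_{\tau,k}\|_{\mathrm{Lip}} \le 1$. Indeed, fix $u, v \in X$. If $P_D(u) = P_D(v) = U$ then

\[ \|\Lambda_{\tau,k}(u) - \Lambda_{\tau,k}(v)\|_2 \le \tfrac{1}{2} |\rho(u) - \rho(v)| \cdot \|\tilde\gamma_{U,k}(u)\|_2 + \tfrac{1}{2}\, \rho(v) \cdot \|\tilde\gamma_{U,k}(u) - \tilde\gamma_{U,k}(v)\|_2 \]
\[ \le \tfrac{1}{2} \left( \tau \|\rho\|_{\mathrm{Lip}} + \|\tilde\gamma_{U,k}\|_{\mathrm{Lip}} \right) d(u,v) \le d(u,v). \]

Otherwise, assume that $P_D(u) \ne P_D(v)$. In particular,

\[ d(u,v) \ge \max\{ d(u, X \setminus P_D(u)),\ d(v, X \setminus P_D(v)) \}. \]

It follows that

\[ \|\Lambda_{\tau,k}(u) - \Lambda_{\tau,k}(v)\|_2 \le \|\Lambda_{\tau,k}(u)\|_2 + \|\Lambda_{\tau,k}(v)\|_2 \le \frac{d(u, X \setminus P_D(u))}{2\tau} \cdot \tau + \frac{d(v, X \setminus P_D(v))}{2\tau} \cdot \tau \le d(u,v). \]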

Now suppose that $x \in S_\tau(k)$, $y \in X$, and $d(x, y) \in [\tau, 3\tau]$. Observe that since $\mathrm{diam}(P_D(x)) < D$,
we have $P_D(x) \subseteq B(x, 2D)$. It follows that since $x \in S_\tau(k)$, we have $x \in T_\tau(P_D(x); k)$ (recall
equation (5)). Moreover, using the defining property of the $\alpha_X$-padded bundle, with probability at
least $\frac{1}{2}$, we have $d(x, X \setminus P_D(x)) \ge 4\tau$. Since we are assuming that $d(x,y) \le 3\tau$, this implies that
$\rho(x) = \rho(y) = 1$. It follows that

\[ \mathbb{E}\, \|\Lambda_{\tau,k}(x) - \Lambda_{\tau,k}(y)\|_2 \ge \tfrac{1}{2} \cdot \tfrac{1}{2}\, \mathbb{E}\, \|\tilde\gamma_{P_D(x),k}(x) - \tilde\gamma_{P_D(x),k}(y)\|_2 \]
\[ \ge \tfrac{1}{8}\, \mathbb{E} \left( \min\{ \|\gamma_{P_D(x),k}(x) - \gamma_{P_D(x),k}(y)\|_2,\ \tau \} \right) \ge \frac{\tau}{240\, C (\log k)^{\varepsilon}}. \]

Denoting by $(\Omega, \mu)$ the probability space on which $\Lambda_{\tau,k}$ is defined, we can think of $\Lambda_{\tau,k}$ as a mapping
of X into the Hilbert space $L_2(L_2, \Omega, \mu)$ which has the required properties. □
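
The weight $\rho$ that fades each cluster's map toward the cluster boundary can be illustrated on a toy partition. The Python sketch below assumes the normalization $\rho(z) = \min\{1, d(z, X \setminus P(z))/\tau\}$ (the denominator is a reconstruction, and the names and the example partition are hypothetical) and checks numerically that $\rho$ is $1/\tau$-Lipschitz, including across cluster boundaries.

```python
def boundary_weight(z, cluster, points, dist, tau):
    """rho(z) = min{1, d(z, X \\ P(z)) / tau} for the cluster P(z) containing z."""
    outside = [dist(z, w) for w in points if w not in cluster]
    gap = min(outside) if outside else float("inf")
    return min(1.0, gap / tau)

# Hypothetical example: integer points on a line, clusters of 8 consecutive points.
points = list(range(32))
dist = lambda a, b: abs(a - b)
partition = [set(range(i, i + 8)) for i in range(0, 32, 8)]
cluster_of = {z: c for c in partition for z in c}
tau = 2.0
rho = {z: boundary_weight(z, cluster_of[z], points, dist, tau) for z in points}
# For u, v in different clusters both boundary distances are at most d(u, v),
# which is why the Lipschitz bound survives the change of cluster.
lipschitz_ok = all(abs(rho[u] - rho[v]) <= dist(u, v) / tau + 1e-12
                   for u in points for v in points)
```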
The following theorem is a generalization of the Gluing Lemma in [22]. In particular, it is
important for us that part (2) treats x and y symmetrically, unlike in [22].

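Theorem 4.5 (Inter-scale gluing). Given any n-point metric space (X, d) and constants $A, B \ge
1$, and for every $m \in \mathbb{Z}$, a 1-Lipschitz map $\varphi_m : X \to L_2$, there exists a map $\psi : X \to L_2$ which
satisfies

1. $\|\psi\|_{\mathrm{Lip}} \le O\left(\sqrt{\log n \log(AB)}\right)$.

2. For every $x, y \in X$ we have

\[ \|\psi(x) - \psi(y)\|_2 \ge \max_{m \in \mathbb{Z}}\ \frac{1}{4} \sqrt{ \left\lfloor \log \frac{|B(x, 2^{m+1} A)|}{|B(x, 2^m / B)|} \right\rfloor } \cdot \min\left\{ \frac{2^m}{B},\ \|\varphi_m(x) - \varphi_m(y)\|_2 \right\}. \]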

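Proof. Let $\rho : \mathbb{R} \to \mathbb{R}_+$ be any 2B-Lipschitz map with $\rho \equiv 1$ on $[1/B, 2A]$, and $\rho \equiv 0$ outside
$[1/(2B), 4A]$. For $x \in X$ and $t \ge 0$, define

\[ R(x,t) = \sup\{ R : |B(x, R)| \le 2^t \}, \]

and observe that $R(\cdot, t)$ is 1-Lipschitz for every value of t. For each $m \in \mathbb{Z}$, define

\[ \rho_{m,t}(x) = \rho\left( \frac{R(x,t)}{2^m} \right). \]

Write $\tilde\varphi_m = G_{2^m/B} \circ \varphi_m$, where $G_{2^m/B}$ is as in (6). Now, for each $t \in \{1, 2, \ldots, \lceil \log_2 n \rceil\}$, define
$\psi_t : X \to \ell_2(L_2)$ by

\[ \psi_t(x) = \bigoplus_{m \in \mathbb{Z}} \rho_{m,t}(x) \cdot \tilde\varphi_m(x). \]

Finally, let $\psi = \psi_1 \oplus \psi_2 \oplus \cdots \oplus \psi_{\lceil \log_2 n \rceil}$.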
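First, we bound $\|\psi_t\|_{\mathrm{Lip}}$ as follows.

\[ \|\psi_t(x) - \psi_t(y)\|_2^2 = \sum_{m \in \mathbb{Z} :\ \rho_{m,t}(x) + \rho_{m,t}(y) > 0} \|\rho_{m,t}(x) \tilde\varphi_m(x) - \rho_{m,t}(y) \tilde\varphi_m(y)\|_2^2. \]

The number of non-zero summands above is at most $O(\log A + \log B)$. Furthermore, each summand
can be bounded as follows.

\[ \|\rho_{m,t}(x) \tilde\varphi_m(x) - \rho_{m,t}(y) \tilde\varphi_m(y)\|_2 \le |\rho_{m,t}(x) - \rho_{m,t}(y)| \cdot \|\tilde\varphi_m(x)\|_2 + \|\tilde\varphi_m(x) - \tilde\varphi_m(y)\|_2 \cdot |\rho_{m,t}(y)| \]
\[ \le \left( \|\rho_{m,t}\|_{\mathrm{Lip}} \cdot \frac{2^m}{B} + \|\tilde\varphi_m\|_{\mathrm{Lip}} \right) d(x,y) \le 4\, d(x,y). \]

Thus $\|\psi_t\|_{\mathrm{Lip}} \le O\left(\sqrt{\log(AB)}\right)$. It follows that $\|\psi\|_{\mathrm{Lip}} \le O\left(\sqrt{\log n \log(AB)}\right)$, as claimed.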

It remains to prove the lower bound. To this end, fix m £ Z, x, y £ X and observe that if 
Pm,t(x) = 1, then 



(x) - 1pt(y)\\2 > \\<t>m{x) ~ 4>m(y)\\2 - (1 - Pm,t(y)) ' \\4>m{y)\\2 

1 f 2 m 1 2 m 

> o mill< l ■^■»HAn(«) - 0m(y)||2 \ - — ■ (1 - Pm,t(y))- 



B 



(7) 
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On the other hand 

|hk(z) - Mv)\\2 > \\M*)\\2 - \\My)\\2 = U m (x)h - Pm,t(v)\\Mv)h = ^^ l ~ p™M)- (») 

Averaging (J7J) and (JHJ) we get that 

1 f 2 m 1 

||^)-^t(y)||2 > 4 min { 5- ,\\<t>m{x) - <f) m {y)\\ 2 \ . (9) 

Hence it suffices to count the number of values of t for which p m ,t{x) = 1. By our definitions we 
have that 

2 m 

Pm,t(x) = l <=^> — <R(x,t)<2 m+1 A ^ t€ [log \B(x, 2 m /B)\,log \B(x,2 m+1 A)\}. 



, \B(x,2^A)\ 
1U & \B{x,2 m /B)\ 



values of t. □ 



This completes the proof since the lower bound ® holds for 
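The quantity $R(x,t)$ used in the proof can be computed directly on a finite metric. The following is a minimal Python sketch, assuming closed balls and a distance matrix `dist`; the function name `ball_growth_radius` and the example points are hypothetical and only illustrate the definition and the 1-Lipschitz property in $x$.

```python
import math

def ball_growth_radius(dist, x, t):
    """R(x, t) = sup{R >= 0 : |B(x, R)| <= 2**t}, assuming closed balls."""
    n = len(dist)
    cap = math.floor(2 ** t)      # largest ball cardinality allowed
    if cap >= n:
        return math.inf           # every ball contains at most n <= 2**t points
    radii = sorted(dist[x])       # distances from x, starting with d(x, x) = 0
    # |B(x, R)| <= cap holds exactly when R < radii[cap], so the sup is radii[cap].
    return radii[cap]

# Hypothetical example: five points on the real line.
points = [0, 1, 2, 4, 8]
dist = [[abs(p - q) for q in points] for p in points]
radius_example = ball_growth_radius(dist, 0, 1)
```

In this sketch $R(x,t)$ is simply the distance from $x$ to its $(\lfloor 2^t \rfloor + 1)$-th nearest point (counting $x$ itself), which is why it is 1-Lipschitz in $x$.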

We also present the following base case. 

Claim 4.6 (Small ratios). Let $X$ be an $n$-point metric space and $\tau, \lambda > 0$. Define the subset

\[ S_\lambda = \{x \in X : |B(x,\tau)| \le \lambda |B(x,\tau/2)|\}. \]

Then there exists a 1-Lipschitz map $F : X \to L_2$ such that if $x \in S_\lambda$ and $y \in X$ with $d(x, y) \ge \tau$,
then

\[ \|F(x) - F(y)\| \ge \frac{\varepsilon(\lambda)\,\tau}{\sqrt{\log n}}, \]

where $\varepsilon(\lambda) > 0$ is a constant depending only on $\lambda$.

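Proof. For each $t \in \{1, 2, \ldots, \lceil \log n \rceil\}$, let $W_t \subseteq X$ be a random subset which contains each point
of $X$ independently with probability $2^{-t}$. Let $g_t(x) = \min\{d(x, W_t), \tau/4\}$ and define the random
map $f = \frac{1}{\sqrt{\lceil \log n \rceil}} \bigl(g_1 \oplus \cdots \oplus g_{\lceil \log n \rceil}\bigr)$ so that $\|f\|_{\mathrm{Lip}} \le 1$. Finally we define $F : X \to L_2(\mu)$ by
$F(x) = f(x)$, where $\mu$ is the distribution over which the random subsets $\{W_t\}$ are defined.

Now fix $x \in S_\lambda$ and let $t \in \mathbb{N}$ be such that $2^t \le |B(x,\tau/2)| < 2^{t+1}$. Let $\mathcal{E}_{\mathrm{far}}$ be the event
$\{d(x,W_t) \ge \tau/4\}$ and let $\mathcal{E}_{\mathrm{close}}$ be the event $\{d(x,W_t) \le \tau/8\}$. Clearly both events are
independent of the values $\{g_t(z) : d(x, z) \ge \tau\}$ (this relies crucially on the use of $\min\{\cdot, \tau/4\}$ in the
definition of $g_t$). In particular, these events are independent of the value $g_t(y)$. It follows that

\begin{align*}
\|F(x) - F(y)\|_{L_2(\mu)}^2 &= \mathbb{E}_\mu \|f(x) - f(y)\|_2^2 \\
&= \frac{1}{\lceil \log n \rceil} \sum_{s=1}^{\lceil \log n \rceil} \mathbb{E}_\mu |g_s(x) - g_s(y)|^2 \\
&\ge \frac{1}{\lceil \log n \rceil}\, \mathbb{E}_\mu |g_t(x) - g_t(y)|^2 \\
&\gtrsim \frac{\tau^2}{\log n} \min\{\Pr(\mathcal{E}_{\mathrm{far}}), \Pr(\mathcal{E}_{\mathrm{close}})\}.
\end{align*}

Finally, we observe that $\Pr(\mathcal{E}_{\mathrm{far}})$ and $\Pr(\mathcal{E}_{\mathrm{close}})$ can clearly be lower bounded by some $\varepsilon(\lambda) > 0$.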

□ 
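The random map $f$ from the proof of Claim 4.6 is easy to sample on a small example. The sketch below is only an illustration under stated assumptions: logarithms are taken base 2, the direct sum is realised as concatenation of coordinates, an empty $W_t$ is treated as being at infinite distance, and the names `frechet_sample` and `euclid` are hypothetical. It draws one realisation of the sets $W_t$ and returns the resulting vectors.

```python
import math
import random

def frechet_sample(dist, tau, rng):
    """One sample of f = (g_1, ..., g_T) / sqrt(T), with T = ceil(log2 n),
    g_t(x) = min(d(x, W_t), tau / 4), and W_t keeping each point w.p. 2**-t."""
    n = len(dist)
    T = max(1, math.ceil(math.log2(n)))
    coords = [[] for _ in range(n)]
    for t in range(1, T + 1):
        W = [w for w in range(n) if rng.random() < 2.0 ** -t]
        for x in range(n):
            # Distance to an empty set is taken to be infinite (an assumption).
            d_to_W = min((dist[x][w] for w in W), default=math.inf)
            coords[x].append(min(d_to_W, tau / 4) / math.sqrt(T))
    return coords

def euclid(a, b):
    return math.sqrt(sum((s - u) ** 2 for s, u in zip(a, b)))

# Hypothetical example: six points on the real line, scale tau = 4.
points = [0, 1, 3, 7, 15, 16]
dist = [[abs(p - q) for q in points] for p in points]
embedding = frechet_sample(dist, 4.0, random.Random(0))
```

Every coordinate $g_t$ is 1-Lipschitz for each realisation of $W_t$, so the normalised concatenation is 1-Lipschitz deterministically; only the lower bound in the claim is probabilistic.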

We are now in position to conclude the proof of Theorem 4.1.

Proof of Theorem 4.1. We claim that for every $K \in [2,n]$ there exists a map $f_K : X \to L_2$ which
satisfies

1. $\|f_K\|_{\mathrm{Lip}} \le O\bigl(\sqrt{\log n \cdot \log\log n}\bigr)$.

2. For every $m \in \mathbb{Z}$ and $x \in S_{2^m}(K)$, $y \in X$ we have

\[
\|f_K(x) - f_K(y)\|_2^2 \ge \left\lfloor \log \frac{|B(x, 2^{m+3}\alpha_X)|}{|B(x, 2^m/[12C(\log K)^{\varepsilon}])|} \right\rfloor \cdot \frac{2^{2m}}{C^2(\log K)^{2\varepsilon}}.
\]

Indeed, $f_K$ is obtained from an application of Theorem 4.5 to the mappings $\{\Lambda_{2^m,K}\}_{m \in \mathbb{Z}}$ from
Lemma 4.4 with $A = 4\alpha_X$ and $B = 12C(\log K)^{\varepsilon}$ (and using the fact that $\alpha_X = O(\log n)$).

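Observe that for every $m \in \mathbb{Z}$, $S_{2^m}(n) = X$. Hence, defining $K_0 = n$ and $K_{j+1} = \sqrt{K_j}$ as long
as $K_j \ge 4$, we obtain mappings $f_0, f_1, \ldots : X \to L_2$ (where $f_j = f_{K_j}$) satisfying

1. $\|f_j\|_{\mathrm{Lip}} \le O\bigl(\sqrt{\log n \cdot \log\log n}\bigr)$.

2. For all $x \in S_{2^m}(K_j) \setminus S_{2^m}(K_{j+1})$ and $y \in X$ such that $d(x,y) \in [2^m, 2^{m+1}]$ we have

\begin{align*}
\|f_j(x) - f_j(y)\|_2^2
&\ge \left\lfloor \log \frac{|B(x, 2^{m+3}\alpha_X)|}{|B(x, 2^m/[12C(\log K_j)^{\varepsilon}])|} \right\rfloor \cdot \frac{2^{2m}}{C^2(\log K_j)^{2\varepsilon}} \\
&\ge \left\lfloor \log \frac{|B(x, 2^{m+3}\alpha_X)|}{|B(x, 2^m/[12C(\log K_{j+1})^{\varepsilon}])|} \right\rfloor \cdot \frac{d(x,y)^2}{4C^2(\log K_j)^{2\varepsilon}} \tag{10} \\
&\ge \lfloor \log K_{j+1} \rfloor \cdot \frac{d(x,y)^2}{4C^2(\log K_j)^{2\varepsilon}} \tag{11} \\
&\ge \frac{d(x,y)^2}{16\,C^2(\log K_j)^{2\varepsilon - 1}}, \tag{12}
\end{align*}

where in (10) we used the fact that $K_{j+1} \le K_j$ and $d(x,y) \le 2^{m+1}$, in (11) we used the fact that
$x \notin S_{2^m}(K_{j+1})$, and in (12) we used the fact that $K_{j+1} = \sqrt{K_j} \ge 2$.

This procedure ends after $N$ steps, where $N \le O(\log\log n)$. Every $x \in S_{2^m}(K_N)$ satisfies

\[ |B(x, 2^{m+3}\alpha_X)| \le 4\,|B(x, 2^m/[12C])|. \]

By Claim 4.6 there is a mapping $f_{N+1} : X \to L_2$ which is Lipschitz with constant $O(\sqrt{\log n})$ and
for every $x, y \in S_{2^m}(K_N)$, $\|f_{N+1}(x) - f_{N+1}(y)\|_2 \ge d(x,y)$.

Consider the map $\Phi = \bigoplus_{j=0}^{N+1} f_j$, which is Lipschitz with constant $O(\sqrt{\log n} \cdot \log\log n)$. For
every $x, y \in X$ choose $m \in \mathbb{Z}$ such that $d(x, y) \in [2^m, 2^{m+1}]$. If $x, y \in S_{2^m}(K_N)$ then

\[ \|\Phi(x) - \Phi(y)\|_2 \ge \|f_{N+1}(x) - f_{N+1}(y)\|_2 \ge d(x,y). \]

Otherwise, without loss of generality there is $j \in \{0, \ldots, N-1\}$ such that $x \in S_{2^m}(K_j) \setminus S_{2^m}(K_{j+1})$,
in which case by (12)

\[
\|\Phi(x) - \Phi(y)\|_2 \ge \|f_j(x) - f_j(y)\|_2 \ge \frac{d(x,y)}{4C(\log K_j)^{\varepsilon - 1/2}} \ge \frac{d(x,y)}{4C(\log n)^{\varepsilon - 1/2}}.
\]

□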
5 The sparsest cut problem with general demands



This section is devoted to the proof of Theorem 1.2. Our argument follows the well known approach
for deducing the algorithmic Theorem 1.2 from the embedding result contained in Theorem 1.1 (see

e.g. (2Z10IIE1)- 

5.1 Computing the Euclidean distortion 

In this section, we remark that the maps used to prove Theorem 1.1 have a certain "auto-extendability"
property which will be used in the next section. We also recall that it is possible to
find near-optimal Euclidean embeddings using semi-definite programming [27].

Corollary 5.1. Let $(Y,d)$ be an arbitrary metric space, and fix a $k$-point subset $X \subseteq Y$. If the
space $(X, d)$ is a metric of negative type, then there exists a 1-Lipschitz map $f : Y \to L_2$ such that
the map $f|_X : X \to L_2$ has distortion $O(\sqrt{\log k} \cdot \log\log k)$.

Proof. We observe that the maps used to prove Theorem 1.1, i.e. those produced in Lemma 3.5
and Claim 4.6, are of Fréchet type. In other words, there is a probability space $(\Omega, \mu)$ over subsets
$A_\omega \subseteq X$ for $\omega \in \Omega$, and we obtain maps $\varphi_{S,\tau} : X \to L_2(\mu)$ given by $\varphi_{S,\tau}(x)(\omega) = d(x, A_\omega)$. We
can then define the extension $\tilde\varphi_{S,\tau} : Y \to L_2(\mu)$ by

\[ \tilde\varphi_{S,\tau}(y)(\omega) = d(y, A_\omega). \]

Thus by extending the ensemble of maps $\{\varphi_{S,\tau}\}$ to the larger space $Y$ before the application of
Theorem 4.1, we can ensure that the final embedding is 1-Lipschitz on $Y$. □

Now we suppose that $(Y, d)$ is an $n$-point metric space and $X \subseteq Y$ is a $k$-point subset.

Claim 5.2. There exists a polynomial-time algorithm (in terms of $n$) which, given $X$ and $Y$,
computes a map $f : Y \to L_2$ such that $f|_X$ has minimal distortion among all 1-Lipschitz maps $f$.

Proof. We give a semi-definite program computing the optimal $f$, which can be solved in polynomial
time using the methods of [17].





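SDP (5.1)

\[
\begin{array}{lll}
\max & \varepsilon & \\
\text{s.t.} & x_u \in \mathbb{R}^n & \forall u \in Y \\
 & \|x_u - x_v\|_2^2 \le d(u,v)^2 & \forall u, v \in Y \\
 & \|x_u - x_v\|_2^2 \ge \varepsilon\, d(u,v)^2 & \forall u, v \in X
\end{array}
\]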



□ 
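To make the roles of the two constraint families in SDP (5.1) concrete, the following Python sketch evaluates a candidate solution rather than solving the program. It assumes the candidate is given as a list of coordinate vectors indexed like the points of $Y$; the function name `sdp51_value` and the tolerance `tol` are hypothetical. It returns the largest feasible $\varepsilon$ for that candidate, or `None` if some Lipschitz constraint is violated.

```python
def sdp51_value(vectors, dist, X, tol=1e-9):
    """Largest eps for which `vectors` is feasible in SDP (5.1), else None.

    vectors[u] is the candidate x_u, dist[u][v] = d(u, v), X lists the indices
    of the k-point subset (at least two distinct points are assumed)."""
    def sq(u, v):
        return sum((a - b) ** 2 for a, b in zip(vectors[u], vectors[v]))

    n = len(vectors)
    # Constraint on all of Y: the map must be 1-Lipschitz.
    for u in range(n):
        for v in range(u + 1, n):
            if sq(u, v) > dist[u][v] ** 2 + tol:
                return None
    # Constraint on X only: eps is the worst squared contraction ratio.
    return min(sq(u, v) / dist[u][v] ** 2 for u in X for v in X if u < v)

# Hypothetical example: Y = three points on a line, X = the two endpoints.
dist = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
isometric = sdp51_value([[0.0], [1.0], [2.0]], dist, [0, 2])
shrunk = sdp51_value([[0.0], [0.5], [1.0]], dist, [0, 2])
stretched = sdp51_value([[0.0], [3.0], [0.0]], dist, [0, 2])
```

For a feasible candidate with value $\varepsilon$, the restriction to $X$ has distortion at most $1/\sqrt{\varepsilon}$, which is why maximising $\varepsilon$ minimises the distortion on $X$.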



5.2 The Sparsest Cut 

Let $V$ be an $n$-point set with two symmetric weights on pairs $w_N, w_D : V \times V \to \mathbb{R}_+$ (i.e. $w_N(x, y) =
w_N(y, x)$ and $w_D(x, y) = w_D(y, x)$). For a subset $S \subseteq V$, we define the sparsity of $S$ by

\[
\Phi_{w_N, w_D}(S) = \frac{\sum_{u \in S,\, v \in V \setminus S} w_N(u,v)}{\sum_{u \in S,\, v \in V \setminus S} w_D(u,v)},
\]

and we let $\Phi^*(V, w_N, w_D) = \min_{S \subseteq V} \Phi_{w_N, w_D}(S)$. (The set $V$ is usually thought of as the vertex
set of a graph with $w_N(u,v)$ supported only on edges $(u, v)$, but this is unnecessary since we allow
arbitrary weight functions.)
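For very small instances the definition of $\Phi^*$ can be evaluated by exhaustive search, which is useful as a reference when testing an approximation algorithm. The Python sketch below is only an illustration: the function names are hypothetical, a cut with zero total demand across it is assigned infinite sparsity (a convention chosen here, not taken from the text), and the running time is exponential in $|V|$.

```python
from itertools import combinations

def sparsity(S, V, w_N, w_D):
    """Phi_{w_N, w_D}(S): capacity across (S, V \\ S) divided by demand across it."""
    S = set(S)
    cut = [(u, v) for u in S for v in V if v not in S]
    num = sum(w_N(u, v) for u, v in cut)
    den = sum(w_D(u, v) for u, v in cut)
    return num / den if den > 0 else float("inf")

def brute_force_sparsest_cut(V, w_N, w_D):
    """Try every proper non-empty subset S of V and keep the sparsest one."""
    best_val, best_S = float("inf"), None
    for r in range(1, len(V)):
        for S in combinations(V, r):
            val = sparsity(S, V, w_N, w_D)
            if val < best_val:
                best_val, best_S = val, set(S)
    return best_val, best_S

# Hypothetical example: path 0-1-2-3 with unit capacities and uniform demands.
V = [0, 1, 2, 3]
w_N = lambda u, v: 1.0 if abs(u - v) == 1 else 0.0
w_D = lambda u, v: 1.0 if u != v else 0.0
opt_val, opt_S = brute_force_sparsest_cut(V, w_N, w_D)
```

On this path the balanced cut through the middle edge is the sparsest: one unit of capacity is cut against four separated demand pairs.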

Computing the value of $\Phi^*(V, w_N, w_D)$ is NP-hard (HOI-. The following semi-definite program is
well known to be a relaxation of $\Phi^*(V, w_N, w_D)$ (see e.g. [TH]).

SDP (5.2)

\[
\begin{array}{lll}
\min & \sum_{u,v \in V} w_N(u,v)\, \|x_u - x_v\|_2^2 & \\
\text{s.t.} & x_u \in \mathbb{R}^n & \forall u \in V \\
 & \sum_{u,v \in V} w_D(u,v)\, \|x_u - x_v\|_2^2 = 1 & \\
 & \|x_u - x_v\|_2^2 \le \|x_u - x_w\|_2^2 + \|x_w - x_v\|_2^2 & \forall u, v, w \in V
\end{array}
\]

Furthermore, an optimal solution to this SDP can be computed in polynomial time [17, 16].

The algorithm. We now give our algorithm for rounding SDP (5.2). Suppose that the weight
function $w_D$ is supported only on pairs $u, v$ for which $u, v \in U \subseteq V$, and let $k = |U|$. Denote
$M = 20 \log n$.

1. Solve SDP (5.2), yielding a solution $\{x_u\}_{u \in V}$.

2. Consider the metric space $(V,d)$ given by $d(u,v) = \|x_u - x_v\|_2^2$.

3. Applying SDP (5.1) to $U$ and $(V,d)$ (where $Y = V$ and $X = U$), compute the optimal map
$f : V \to \mathbb{R}^n$.

4. Choose $\beta_1, \ldots, \beta_M \in \{-1, +1\}^n$ independently and uniformly at random.

5. For each $1 \le i \le M$, arrange the points of $V$ as $v_1^i, \ldots, v_n^i$ so that

$\langle \beta_i, f(v_j^i) \rangle \le \langle \beta_i, f(v_{j+1}^i) \rangle$ for each $1 \le j \le n - 1$.

6. Output the sparsest of the $Mn$ cuts

$(\{v_1^i, \ldots, v_m^i\}, \{v_{m+1}^i, \ldots, v_n^i\})$, $1 \le m \le n-1$, $1 \le i \le M$.
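Steps 4 to 6 above can be sketched in a few lines of Python once an embedding $f$ is available. The sketch below does not solve SDP (5.2) or SDP (5.1); it assumes $f$ is handed in as a dictionary from vertices to coordinate lists, and the names `sweep_cut_rounding`, `sparsity` and `rng` are hypothetical. Cuts with zero demand across them are treated as having infinite sparsity, which is a convention chosen for this illustration.

```python
import random

def sparsity(S, V, w_N, w_D):
    """Capacity across (S, V \\ S) divided by demand across it."""
    cut = [(u, v) for u in S for v in V if v not in S]
    num = sum(w_N(u, v) for u, v in cut)
    den = sum(w_D(u, v) for u, v in cut)
    return num / den if den > 0 else float("inf")

def sweep_cut_rounding(f, V, w_N, w_D, M, rng):
    """Steps 4-6: project onto M random sign vectors, test every prefix cut."""
    dim = len(f[V[0]])
    best_val, best_S = float("inf"), None
    for _ in range(M):
        beta = [rng.choice((-1, 1)) for _ in range(dim)]           # step 4
        order = sorted(V, key=lambda v: sum(b * c for b, c in zip(beta, f[v])))  # step 5
        for m in range(1, len(order)):                             # step 6
            S = set(order[:m])
            val = sparsity(S, V, w_N, w_D)
            if val < best_val:
                best_val, best_S = val, S
    return best_val, best_S

# Hypothetical example: a path 0-1-2-3 embedded on a line as two tight clusters.
V = [0, 1, 2, 3]
f = {0: [0.0], 1: [0.1], 2: [1.0], 3: [1.1]}
w_N = lambda u, v: 1.0 if abs(u - v) == 1 else 0.0
w_D = lambda u, v: 1.0 if u != v else 0.0
best_val, best_S = sweep_cut_rounding(f, V, w_N, w_D, M=3, rng=random.Random(1))
```

Because the example embedding is one-dimensional, every sign vector induces the same family of prefix cuts up to complementation, so the outcome does not depend on the random choices here.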

Claim 5.3. With constant probability over the choice of $\beta_1, \ldots, \beta_M$, the cut $(S, V \setminus S)$ returned by
the algorithm has

\[ \Phi_{w_N, w_D}(S) \le O\bigl(\sqrt{\log k} \cdot \log\log k\bigr)\, \Phi^*(V, w_N, w_D). \tag{13} \]

Proof. Let $S \subseteq \mathbb{R}^n$ be the image of $V$ under the map $f$. Consider the map $g : S \to \ell_1^M$ given by
$g(x) = (\langle \beta_1, x \rangle, \ldots, \langle \beta_M, x \rangle)$. It is well-known (see, e.g. 0|32]) that, with constant probability
over the choice of $\{\beta_i\}$, $g$ has distortion $O(1)$ (where $S$ is equipped with the Euclidean
metric). In this case, we claim that (13) holds.

To see this, let $S_1, S_2, \ldots, S_{Mn} \subseteq V$ be the $Mn$ cuts which are tested in line (6). It is a standard
fact [17] that there exist constants $\alpha_1, \alpha_2, \ldots, \alpha_{Mn} \ge 0$ such that for every $x, y \in V$,

\[ \|g(f(x)) - g(f(y))\|_1 = \sum_{i=1}^{Mn} \alpha_i\, \rho_{S_i}(x,y), \]

where $\rho_{S_i}(x,y) = 1$ if $x$ and $y$ are on opposite sides of the cut $(S_i, V \setminus S_i)$ and $\rho_{S_i}(x,y) = 0$
otherwise.

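Assume (by scaling) that $g \circ f : Y \to \ell_1^M$ is 1-Lipschitz. Let $\Lambda$ be the distortion of $g \circ f$. By
Corollary 5.1, $\Lambda = O(\sqrt{\log k} \cdot \log\log k)$. Recalling that $w_D(u,v) > 0$ only when $u, v \in U$,

\begin{align*}
\Phi^*(V, w_N, w_D)
&\ge \frac{\sum_{u,v \in V} w_N(u,v)\, \|x_u - x_v\|_2^2}{\sum_{u,v \in U} w_D(u,v)\, \|x_u - x_v\|_2^2} \\
&\ge \frac{1}{\Lambda} \cdot \frac{\sum_{u,v \in V} w_N(u,v)\, \|g(f(u)) - g(f(v))\|_1}{\sum_{u,v \in U} w_D(u,v)\, \|g(f(u)) - g(f(v))\|_1} \\
&= \frac{1}{\Lambda} \cdot \frac{\sum_{i=1}^{Mn} \alpha_i \sum_{u,v \in V} w_N(u,v)\, \rho_{S_i}(u,v)}{\sum_{i=1}^{Mn} \alpha_i \sum_{u,v \in U} w_D(u,v)\, \rho_{S_i}(u,v)} \\
&\ge \frac{1}{\Lambda} \min_i \frac{\sum_{u,v \in V} w_N(u,v)\, \rho_{S_i}(u,v)}{\sum_{u,v \in U} w_D(u,v)\, \rho_{S_i}(u,v)} \\
&= \frac{\Phi_{w_N, w_D}(S)}{\Lambda}.
\end{align*}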

This completes the proof. □ 
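The cut decomposition of an $\ell_1$ metric used in the proof can be made explicit: in each coordinate, sort the points and weight every prefix cut by the gap between consecutive values. The Python sketch below illustrates this standard construction on a toy point set; the function names `l1_cut_decomposition` and `cut_distance` are hypothetical, and the points stand in for the images $g(f(v))$.

```python
def l1_cut_decomposition(points):
    """Return pairs (alpha, S) with ||p_x - p_y||_1 equal to the total alpha
    of the cuts S that separate x from y."""
    n = len(points)
    cuts = []
    for c in range(len(points[0])):
        order = sorted(range(n), key=lambda i: points[i][c])
        for j in range(n - 1):
            gap = points[order[j + 1]][c] - points[order[j]][c]
            if gap > 0:
                # Prefix cut of the ordering in coordinate c, weighted by the gap.
                cuts.append((gap, frozenset(order[: j + 1])))
    return cuts

def cut_distance(cuts, x, y):
    """Sum of the weights of the cuts that put x and y on opposite sides."""
    return sum(alpha for alpha, S in cuts if (x in S) != (y in S))

def l1(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical example: three points in the plane with integer coordinates.
points = [[0, 2], [1, 0], [3, 1]]
cuts = l1_cut_decomposition(points)
```

The cuts produced this way are exactly prefix cuts of the coordinate orderings, which is why it suffices for the algorithm to test only the $Mn$ sweep cuts of step 6.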



6 Concluding remarks 

• There are two factors of $O(\sqrt{\log\log n})$ which keep our bound from being optimal up to a
constant factor. One factor of $\sqrt{\log\log n}$ arises because Theorem 4.5 is applied with $A, B \approx
\mathrm{polylog}(n)$. The need for such values arises out of a certain non-locality property which
seems inherent to the method of proof in [4]. We remark that achieving $A = O(1)$ is probably
possible, and it seems that $B$ is the difficult factor.

The other factor arises because, in proving Theorem 1.1, we invoke Theorem 4.5 for $O(\log\log n)$
different values of the parameter $K$. It is likely removable by a more technical induction, but
we chose to present the simpler proof.

• It is an interesting open problem to understand the exact distortion required to embed $n$-point
negative type metrics into $L_1$. As mentioned before, the best known lower bound is
$\Omega(\log\log n)^{\delta}$ |2Uj . We also note that assuming a strong form of the Unique Games Conjecture
is true, the general Sparsest Cut problem is hard to approximate within a factor of (log log n) 

puiig. 

• For the uniform case of Sparsest Cut, it is possible to achieve an $O(\sqrt{\log n})$ approximation in
quadratic time without solving an SDP jSj . Whether such an algorithm exists for the general
case is an open problem.

• There is no asymptotic advantage in embedding $n$-point negative type metrics into $L_p$ for
some $p \in (1,\infty)$, $p \ne 2$ (observe that since $L_2$ is isometric to a subset of $L_p$ for all $p \ge 1$,
our embedding into Hilbert space is automatically also an embedding into $L_p$). Indeed, for
$1 < p < 2$ it is shown in [23] that there are arbitrarily large $n$-point subsets of $L_1$ that
require distortion $\Omega\bigl(\sqrt{(p-1)\log n}\bigr)$ in any embedding into $L_p$. For $2 < p < \infty$ it follows
from [35, 33] that there are arbitrarily large $n$-point subsets of $L_1$ whose minimal distortion
into $L_p$ is $1 + \Theta\bigl(\sqrt{(\log n)/p}\bigr)$ (the dependence on $n$ follows from [35], and the optimal dependence
on $p$ follows from the results of [33]). Thus, up to multiplicative constants depending on $p$
(and the double logarithmic factor in Theorem 1.1), our result is optimal for all $p \in (1,\infty)$.

• Let $(X, d_X)$, $(Y, d_Y)$ be metric spaces and $\eta : [0, \infty) \to [0, \infty)$ a strictly increasing function.
A one to one mapping $f : X \to Y$ is called a quasisymmetric embedding with modulus $\eta$ if for
every $x, a, b \in X$ such that $x \ne b$,

\[ \frac{d_Y(f(x), f(a))}{d_Y(f(x), f(b))} \le \eta\left( \frac{d_X(x, a)}{d_X(x, b)} \right). \]

We refer to [18] for an account of the theory of quasisymmetric embeddings. Observe that
metrics of negative type embed quasisymmetrically into Hilbert space. It turns out that our
embedding result generalizes to any $n$-point metric space which embeds quasisymmetrically
into Hilbert space. Indeed, if $(X, d)$ embeds quasisymmetrically into $L_2$ with modulus $\eta$ then,
as shown in the full version of [22], there exist constants $p = p(\eta)$ and $C = C(\eta)$, depending
only on $\eta$, such that $\zeta(X; p) \le C\sqrt{\log n}$.